Boosting Localized Classifiers in Heterogeneous Databases
Abstract
Combining multiple global models (e.g., back-propagation-based neural networks) is an effective technique for improving classification accuracy. The technique reduces variance by manipulating the distribution of the training data. In many large-scale data analysis problems involving heterogeneous databases with attribute instability, standard boosting methods can be improved by coalescing multiple classifiers, each of which uses different germane attribute information identified through an attribute selection process. We propose a new technique for boosting localized classifiers when heterogeneous data sets contain more homogeneous data distributions. Instead of training a single global classifier in each boosting round, we train localized classifiers, each responsible for one homogeneous region. The number of regions is identified by a clustering algorithm run at each boosting iteration. Applied to real-life and synthetic spatial data, the new boosting method improves prediction accuracy when unstable driving attributes and heterogeneity are present in the data. In addition, boosting localized experts significantly reduces the number of iterations needed to achieve the maximal prediction accuracy.
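The idea of replacing the single global classifier of each boosting round with one classifier per homogeneous region can be illustrated with a minimal sketch. It is not the authors' implementation; several choices below are assumptions made only for illustration: the weak learners are decision stumps, the weight update is the standard AdaBoost one, regions are found by a weighted one-dimensional k-means on the first attribute (treated as a spatial coordinate), and the number of regions `k` is fixed rather than identified by the clustering step. All function names are hypothetical.

```python
import math

def stump_predict(stump, x):
    """Predict +1/-1 from a stump given as (feature, threshold, polarity)."""
    f, thr, pol = stump
    return pol if x[f] > thr else -pol

def fit_stump(X, y, w, idx):
    """Weighted decision stump fitted only on the samples listed in idx."""
    if not idx:
        return (0, 0.0, 1)  # arbitrary default for an empty region
    best_err, best = float("inf"), (0, 0.0, 1)
    for f in range(len(X[0])):
        vals = sorted({X[i][f] for i in idx})
        # Candidate thresholds: one below the minimum, then midpoints.
        cands = [vals[0] - 1.0] + [(u + v) / 2 for u, v in zip(vals, vals[1:])]
        for thr in cands:
            for pol in (1, -1):
                s = (f, thr, pol)
                err = sum(w[i] for i in idx if stump_predict(s, X[i]) != y[i])
                if err < best_err:
                    best_err, best = err, s
    return best

def nearest(centers, c):
    """Index of the region center closest to coordinate c."""
    return min(range(len(centers)), key=lambda j: abs(c - centers[j]))

def weighted_kmeans_1d(coords, w, k, iters=20):
    """Assumed region finder: k-means (k >= 2) with boosting weights as mass."""
    lo, hi = min(coords), max(coords)
    centers = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    for _ in range(iters):
        assign = [nearest(centers, c) for c in coords]
        for j in range(k):
            members = [i for i, a in enumerate(assign) if a == j]
            mass = sum(w[i] for i in members)
            if mass > 0:
                centers[j] = sum(w[i] * coords[i] for i in members) / mass
    return centers

def boost_localized(X, y, k=2, rounds=5):
    """AdaBoost-style loop whose round hypothesis is a set of local stumps."""
    n = len(X)
    w = [1.0 / n] * n
    coords = [x[0] for x in X]
    ensemble = []  # entries: (alpha, centers, stumps)
    for _ in range(rounds):
        centers = weighted_kmeans_1d(coords, w, k)  # re-cluster every round
        assign = [nearest(centers, c) for c in coords]
        stumps = [fit_stump(X, y, w, [i for i in range(n) if assign[i] == j])
                  for j in range(k)]
        preds = [stump_predict(stumps[assign[i]], X[i]) for i in range(n)]
        eps = sum(w[i] for i in range(n) if preds[i] != y[i])
        if eps >= 0.5:
            break  # no better than chance: stop boosting
        perfect = eps <= 1e-10
        eps = max(eps, 1e-10)  # avoid division by zero
        alpha = 0.5 * math.log((1 - eps) / eps)
        ensemble.append((alpha, centers, stumps))
        if perfect:
            break
        w = [w[i] * math.exp(-alpha * y[i] * preds[i]) for i in range(n)]
        total = sum(w)
        w = [v / total for v in w]
    return ensemble

def predict(ensemble, x):
    """Weighted vote; each round routes x to the stump of its nearest region."""
    score = sum(a * stump_predict(stumps[nearest(centers, x[0])], x)
                for a, centers, stumps in ensemble)
    return 1 if score >= 0 else -1

# Toy heterogeneous data: the label rule on attribute b flips between regions.
coords_grid = [-1.0, -0.8, -0.6, -0.4, -0.2, 0.2, 0.4, 0.6, 0.8, 1.0]
b_grid = [-0.9, -0.7, -0.5, -0.3, -0.1, 0.1, 0.3, 0.5, 0.7, 0.9]
X = [(a, b) for a in coords_grid for b in b_grid]
y = [(1 if b > 0 else -1) if a < 0 else (1 if b < 0 else -1) for a, b in X]

model = boost_localized(X, y, k=2)
local_acc = sum(predict(model, x) == t for x, t in zip(X, y)) / len(X)

# Baseline: one global stump fitted on all samples with uniform weights.
g = fit_stump(X, y, [1.0 / len(X)] * len(X), list(range(len(X))))
global_acc = sum(stump_predict(g, x) == t for x, t in zip(X, y)) / len(X)
print(local_acc, global_acc)
```

On this toy data the two regions follow opposite label rules, so a single global stump cannot separate the classes, while one stump per region can. The example is deliberately small and noise-free, so it says nothing about the accuracy or iteration counts reported in the abstract.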
Similar resources
Adaptive boosting techniques in heterogeneous and spatial databases
Combining multiple classifiers is an effective technique for improving classification accuracy by reducing the variance through manipulating the training data distributions. In many large-scale data analysis problems involving heterogeneous databases with attribute instability, however, standard boosting methods do not improve local classifiers (e.g. k-nearest neighbors) due to their low sensit...
Improving reservoir rock classification in heterogeneous carbonates using boosting and bagging strategies: A case study of early Triassic carbonates of coastal Fars, south Iran
An accurate reservoir characterization is a crucial task for the development of quantitative geological models and reservoir simulation. In the present research work, a novel view is presented on the reservoir characterization using the advantages of thin section image analysis and intelligent classification algorithms. The proposed methodology comprises three main steps. First, four classes of...
Adaptive Boosting for Spatial Functions with Unstable Driving Attributes
Combining multiple global models (e.g., back-propagation-based neural networks) is an effective technique for improving classification accuracy by reducing variance through manipulating training data distributions. Standard combining methods do not improve local classifiers (e.g., k-nearest neighbors) due to their low sensitivity to data perturbation. Here, we propose an adaptive attribute boos...
Genetic Programming of Heterogeneous Ensembles for Classification
The ensemble classification paradigm is an effective way to improve the performance and stability of individual predictors. Many ways to build ensembles have been proposed so far, most notably bagging and boosting based techniques. Evolutionary algorithms (EAs) also have been widely used to generate ensembles. In the context of heterogeneous ensembles EAs have been successfully used to adjust w...
A Hybrid Framework for Building an Efficient Incremental Intrusion Detection System
In this paper, a boosting-based incremental hybrid intrusion detection system is introduced. This system combines incremental misuse detection and incremental anomaly detection. We use a boosting ensemble of weak classifiers to implement the misuse intrusion detection system. It can identify new types of intrusions that do not exist in the training dataset for incremental misuse detection. As...